perm filename 0[0,BGB]3 blob sn#028572 filedate 1973-03-14 generic text, type T, neo UTF8
00100	DRAFT THESIS OUTLINE.					DECEMBER 1972
00200	
00300	                          GEOMETRIC VISION
00400	                      - draft thesis outline -
00500	
00600	                           B. G. Baumgart
00700	
00800	
00900	ABSTRACT:
01000	
01100		This thesis is about a computer  vision  system  based  on  a
01200	geometric  model  of the objects being viewed. In
01300	principle, this vision system is simply a process that can be applied
01400	to  a  reel  of  video  tape to compute blueprints and geodetic maps.
01500	Applications of this system to object recognition, scene analysis and
01600	robot vehicle control are demonstrated.
01700	
01800	
01900	CONTENTS:
02000	
02100		I. MEMORY.
02200	
02300		   A. 	Representation of a Geometric Mental Universe.
02400		   B.	Region-Edge Image Representation.
02500		   C.	Semantic, Feature and Predicate Representation.
02600	
02700		II. PROCESS.
02800	
02900		   A.	Image Prediction.
03000		   B.	Image Perception.
03100		   C.	Image Comparison.
03200		   D.	Camera Locus Solution.
03300		   E.	World Model Modification.
03400			   1.	delete object from map.
03500			   2.	add known object to map (recognition).
03600			   3. 	add or alter object in dictionary.
03700	
03800		III. APPLICATION.
03900	
04000		   A.	Blocks and Block Scenes.
04100			   1. deletion of a block from a scene.
04200			   2. addition of blocks to a scene.
04300		   B.	Tools and Table Top Scenes.
04400			   1. complicated object perception.
04500			   2. known object recognition.
04600		   C.	A Robot Vehicle and Outdoor Scenes.
04700			   1. known road servoing.
04800			   2. landscape perception.
     

00100	I. MEMORY STRUCTURE.
00200	
00300		In order to get a computer to deal with the physical world it
00400	must  have  a  data  representation  on  which computations involving
00500	space, time, shape, size and the appearance of things can be done. In
00600	this  section,  a  representation  for  the  topology,  geometry  and
00700	photometry of everyday things is  explained.  The  data
00800	structures  discussed  are  implemented  as  small  blocks  of  words
00900	containing pointers and data in the fashion  usual  to  graphics  and
01000	simulation;  an introduction to this technology can be found in Knuth
01100	[1].  Although the language of  implementation  is  PDP-10  machine
01200	code,  the  data  and  functions  presented below are accessible from
01300	higher level languages like LISP and ALGOL.
01400	
01500	I.A. Representation of a Geometric Mental Universe.
01600	
01700		At the top of the data structure is a  single  universe  node
01800	from  which  everything  else can be reached.   Immediately below the
01900	universe node is a ring  of  world  models.   A  robot  dealing  with
02000	physical world sensor input, such as video data, has one of its world
02100	models dedicated to simulating  the  immediate  here  and  now;  this
02200	mental  world  is  called the reality world model. In addition to the
02300	reality world, a robot may have  fantasy  world  models  for  problem
02400	solving, planning or for recalling platonic object prototypes. In the
02500	following, a two-world mental universe will be the most common,  with
02600	the  reality world being referred to as a "map" and the fantasy world
02700	being referred to as a "dictionary".
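
	The universe node and its ring of world models can be pictured with
a short sketch. It is written in Python for readability and is not the
PDP-10 implementation; the class names Universe and World, the field
names, and the add_world and worlds functions are all assumptions made
for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass(eq=False)
class World:
    """One world model; `next` links the worlds into a ring (hypothetical layout)."""
    name: str
    bodies: list = field(default_factory=list)
    next: Optional["World"] = None


class Universe:
    """The single top node from which every world model can be reached."""

    def __init__(self) -> None:
        self.first: Optional[World] = None

    def add_world(self, name: str) -> World:
        world = World(name)
        if self.first is None:
            world.next = world            # a ring of one points at itself
            self.first = world
        else:
            world.next = self.first.next  # splice in right after the first world
            self.first.next = world
        return world

    def worlds(self) -> Iterator[World]:
        """Walk the ring once, starting from the first world."""
        world = self.first
        while world is not None:
            yield world
            world = world.next
            if world is self.first:
                break


universe = Universe()
reality = universe.add_world("map")         # the reality world model
fantasy = universe.add_world("dictionary")  # a fantasy world of prototypes
ring = [w.name for w in universe.worlds()]
```

	A ring is used here rather than a list so that any world can serve
as the starting point of a traversal; whether the original blocks are
linked in exactly this way is not stated in this outline.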
02800	
02900		Geometric world models have four  basic  kinds  of  nodes:
03000	body, face, edge and vertex. The face, edge and vertex nodes are used
03100	to form polyhedrons which may be attached to body nodes.  Body  nodes
03200	in  turn  are  connected  to  each other in rings and trees to form a
03300	world model. Additional kinds of nodes  describe  cameras  and  light
03400	sources  as  well  as  temporary  data  such  as shadows, spines, and
03500	trajectories.
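
	As a rough illustration of the four node kinds, the sketch below
builds a tetrahedron from vertex, edge and face nodes and attaches it to
a second body. It is written in Python and is a simplification: the
class layouts, the field names, and the make_tetrahedron and attach
functions are assumptions, and the extra edge-to-edge links of a full
winged edge structure are not modeled.

```python
from dataclasses import dataclass, field
from itertools import combinations
from typing import Optional


@dataclass(eq=False)
class Vertex:
    x: float
    y: float
    z: float


@dataclass(eq=False)
class Face:
    corners: tuple                              # the vertices around this face


@dataclass(eq=False)
class Edge:
    ends: tuple                                 # the two vertices joined by this edge
    faces: list = field(default_factory=list)   # the faces on either side


@dataclass(eq=False)
class Body:
    name: str
    vertices: list = field(default_factory=list)
    edges: list = field(default_factory=list)
    faces: list = field(default_factory=list)
    parent: Optional["Body"] = None             # tree link between bodies
    parts: list = field(default_factory=list)


def make_tetrahedron(name: str) -> Body:
    body = Body(name)
    body.vertices = [Vertex(0, 0, 0), Vertex(1, 0, 0),
                     Vertex(0, 1, 0), Vertex(0, 0, 1)]
    # In a tetrahedron every triple of vertices bounds a face
    # and every pair of vertices is joined by an edge.
    body.faces = [Face(c) for c in combinations(body.vertices, 3)]
    for a, b in combinations(body.vertices, 2):
        edge = Edge((a, b))
        edge.faces = [f for f in body.faces
                      if a in f.corners and b in f.corners]
        body.edges.append(edge)
    return body


def attach(part: Body, whole: Body) -> None:
    """Hang one body below another in the body tree."""
    part.parent = whole
    whole.parts.append(part)


block = make_tetrahedron("block")
table = Body("table")
attach(block, table)

# Euler's formula for a simple polyhedron: V - E + F = 2.
euler = len(block.vertices) - len(block.edges) + len(block.faces)
```

	Checking that every edge has exactly two faces and that the Euler
count comes out to 2 is a cheap sanity test on any polyhedron built from
such nodes.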
03600	
03700		...continuation of this section follows AIM-179,
03800		"Winged Edge Polyhedron Representation" - Baumgart.
     

00100	I.B. Region-Edge Image Representation.
00200	
00300		The image data structure  presented  in  this  section  is  a
00400	computer's  internal  notation  for  what  is  vulgarly called a line
00500	drawing; the common term is misleading because it  does  not  suggest
00600	the  equally  important  space between the lines; terms closer to the
00700	idea would be "mosaic drawing" or "stained glass window drawing".
00800	
00900		The  data  structure  has  four  main  levels:  TV raster, video
01000	intensity contour, arc contour, and region-edge.
01100		...continuation of this section follows SAILON-71,
01200		"CART'S EYE THREE and its IMAGE REPRESENTATION" - Baumgart.
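
	The point that the space between the lines matters as much as the
lines can be made concrete with a toy example. The Python sketch below
starts from a tiny raster whose cells are already labeled by region and
derives the edges as pairs of touching regions. It skips the intensity
contour and arc contour levels entirely, and the function name
region_edges and the labeled raster are assumptions made for
illustration, not the representation of SAILON-71.

```python
from collections import defaultdict


def region_edges(raster):
    """Return {frozenset({a, b}): boundary_length} for touching regions a and b."""
    edges = defaultdict(int)
    rows, cols = len(raster), len(raster[0])
    for r in range(rows):
        for c in range(cols):
            here = raster[r][c]
            # Look right and down so each cell boundary is counted once.
            for rr, cc in ((r, c + 1), (r + 1, c)):
                if rr < rows and cc < cols and raster[rr][cc] != here:
                    edges[frozenset((here, raster[rr][cc]))] += 1
    return dict(edges)


# A three-pane "stained glass window": regions A, B and C.
raster = [
    "AAB",
    "AAB",
    "CCB",
]
edges = region_edges(raster)
```

	Each edge in the result names the two regions it separates, so the
regions are first-class items of the drawing rather than gaps left over
between lines.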
01300	
01400	
     

00100	II. PROCESS.
00200	
00300	   A.	Image Prediction.
00400	   B.	Image Perception.
00500	   C.	Image Comparison.
00600	   D.	Camera Locus Solution.
00700	   E.	World Model Modification.
00800		   1.	delete object from map.
00900		   2.	add known object to map (recognition).
01000		   3. 	add or alter object in dictionary.
01100	
01200	III. APPLICATION.
01300	
01400	   A.	Blocks and Block Scenes.
01500		   1. deletion of a block from a scene.
01600		   2. addition of blocks to a scene.
01700	   B.	Tools and Table Top Scenes.
01800		   1. complicated object perception.
01900		   2. known object recognition.
02000	   C.	A Robot Vehicle and Outdoor Scenes.
02100		   1. known road servoing.
02200		   2. landscape perception.